ABSTRACT

Purpose. Panitumumab, a fully human anti–epidermal growth factor receptor (EGFR) monoclonal antibody that improves progression-free survival (PFS), is approved as monotherapy for patients with chemotherapy- refractory metastatic colorectal cancer (mCRC). The Panitumumab Randomized Trial in Combination With Chemotherapy for Metastatic Colorectal Cancer to Determine Efficacy (PRIME) was designed to evaluate the efficacy and safety of panitumumab plus infusional fluorouracil, leucovorin, and oxaliplatin (FOLFOX4) versus FOLFOX4 alone as initial treatment for mCRC.

Patients and Methods. In this multicenter, phase III trial, patients with no prior chemotherapy for mCRC, Eastern Cooperative Oncology Group performance status of 0 to 2, and available tissue for biomarker testing were randomly assigned 1:1 to receive panitumumab-FOLFOX4 versus FOLFOX4. The primary end point was PFS; overall survival (OS) was a secondary end point. Results were prospectively analyzed on an intent-to-treat basis by tumor KRAS status.

Results. KRAS results were available for 93% of the 1,183 patients randomly assigned. In the wild-type (WT) KRAS stratum, panitumumab-FOLFOX4 significantly improved PFS compared with FOLFOX4 (median PFS, 9.6 v 8.0 months, respectively; hazard ratio [HR], 0.80; 95% CI, 0.66 to 0.97; P 0.02). A nonsignificant increase in OS was also observed for panitumumab-FOLFOX4 versus FOLFOX4 (median OS, 23.9 v 19.7 months, respectively; HR, 0.83; 95% CI, 0.67 to 1.02; P 0.072). In the mutant KRAS stratum, PFS was significantly reduced in the panitumumab-FOLFOX4 arm versus the FOLFOX4 arm (HR, 1.29; 95% CI, 1.04 to 1.62; P 0.02), and median OS was 15.5 months versus 19.3 months, respectively (HR, 1.24; 95% CI, 0.98 to 1.57; P 0.068). Adverse event rates were generally comparable across arms with the exception of toxicities known to be associated with anti-EGFR therapy.

Conclusion. This study demonstrated that panitumumab-FOLFOX4 was well tolerated and significantly improved PFS in patients with WT KRAS tumors and underscores the importance of KRAS testing for patients with mCRC.

INTRODUCTION

Colorectal cancer (CRC) is the third most common cancer among men and women in the United States. Worldwide, there are more than one million new cases of CRC each year. On the basis of its role in the pathogenesis of CRC, the epidermal growth factor receptor (EGFR) has proven to be a clinically meaningful target for monoclonal antibodies (mAbs) with efficacy established in all lines of treatment of metastatic CRC (mCRC). Panitumumab is a fully human mAb targeting the EGFR. Retrospectively analyzed studies identified KRAS mutation in tumors as a negative predictive factor for panitumumab and cetuximab for improved response rate (RR), progression-free survival (PFS), and overall survival (OS). In September 2007, a prospectively defined, retrospective analysis of the pivotal phase III study of panitumumab as monotherapy in the mCRC setting provided evidence that clinical benefit was specific to patients with wild-type (WT) KRAS tumors. The Panitumumab Randomized Trial in Combination With Chemotherapy for Metastatic Colorectal Cancer to Determine Efficacy (PRIME) is an open-label, randomized, multicenter, phase III trial prospectively investigating panitumumab plus infusional fluorouracil, leucovorin, and oxaliplatin (FOLFOX4) versus FOLFOX4 alone as first-line treatment for mCRC in patients with WT KRAS tumors. Originally designed to compare the treatment effect in all randomly assigned patients, the trial was amended to focus on prospective hypothesis testing in the WT KRAS stratum.

PATIENTS AND METHODS

Study Design and Treatment Schedule. This is an open-label, multicenter, phase III trial that compared the efficacy of panitumumab-FOLFOX4 with FOLFOX4 alone in patients with previously untreated mCRC according to tumor KRAS status. Patients were randomly assigned 1:1 to receive either panitumumab-FOLFOX4 or FOLFOX4. Panitumumab was administered intravenously (IV) every 2 weeks on day 1 before FOLFOX4 chemotherapy.FOLFOX4 was administered every 2 weeks. Treatment was administered until progression or unacceptable toxicity. Objective tumor response was evaluated by blinded central radiology review using modified Response Evaluation Criteria in Solid Tumors (RECIST) in all patients with baseline measurable disease per central review. Patients were evaluated every 8 weeks until progression. Responses were confirmed at least 4 weeks later. Patients were observed for safety 30 days after the last study drug administration and for survival every 3 months. Adverse events (AEs) were graded using the Common Terminology Criteria for Adverse Events (version 3.0) with modifications for specific skin- and nail-related toxicities.

STATISTICS ANALYSIS

Import Libraries

library(readxl)
library(ggplot2)
library(ggpubr)
library(survival)
library(survminer)
## 
## Attaching package: 'survminer'
## The following object is masked from 'package:survival':
## 
##     myeloma
library(ggsurvfit)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(magrittr)
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine

Read the Excel File and all the Sheets

Note: change the path with your own

colon1 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adae_pds2019")
colon2 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adlb_pds2019")
colon3 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adls_pds2019")
colon4 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adpm_pds2019")
colon5 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adrsp_pds2019")
colon6 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet = "adsl_pds2019")
colon7 <- read_excel("/Users/ariannarigamonti/Desktop/colon.xlsx", sheet ="biomark_pds2019")

Merge the Interesting Variables in One Final Dataframe

col1_col6_merged <- merge(colon1, colon6, by.x = "SUBJID")
col1_col6_col7_merged <- merge(col1_col6_merged, colon7, by.x = "SUBJID")
col6_col7_merged <- merge(colon6, colon7, by.x = "SUBJID")

df_final_pre <- unique(col1_col6_col7_merged)

df_final <- df_final_pre %>%
  group_by(SUBJID, AEPT) %>%
  arrange(desc(AESEVCD)) %>%
  slice(1) %>%
  ungroup()

df_final1 <- df_final_pre %>%
  group_by(SUBJID) %>%
  arrange(desc(AESEVCD)) %>%
  slice(1) %>%
  ungroup()

Divide Patients in Mutant (mt) and Wild-Type (wt) for KRAS (Exon 2) Gene ‘BMMTR1’

mutant <- subset(col6_col7_merged, BMMTR1 == "Mutant" | BMMTR2 == "Mutant" | BMMTR3 == "Mutant" | BMMTR4 == "Mutant" |
                   BMMTR5 == "Mutant" | BMMTR6 == "Mutant" | BMMTR7 == "Mutant" | BMMTR15 == "Mutant" | BMMTR16 == "Mutant")

wild_type <- subset(col6_col7_merged, 
                    BMMTR1 == "Wild-type" & 
                      BMMTR2 == "Wild-type" & 
                      BMMTR3 == "Wild-type" & 
                      BMMTR4 == "Wild-type" &
                      BMMTR5 == "Wild-type" & 
                      BMMTR6 == "Wild-type" & 
                      BMMTR7 == "Wild-type" & 
                      BMMTR15 == "Wild-type" & 
                      BMMTR16 == "Wild-type")

Divide patients based on AE grade (1-2 and 3-4)

skin_ae <- subset(df_final1, AESOC == "Skin and subcutaneous tissue disorders")
df_grade_1_2 <- skin_ae[skin_ae$AESEVCD %in% c(1, 2), ]
df_grade_3_4 <- skin_ae[skin_ae$AESEVCD %in% c(3, 4), ]

DESCRIPTIVE STATISTICS

Frequencies Barplot of MT and WT Patients

mutant1 <- mutant
mutant1$BMMTR = "Mutant"

wild_type1 <- wild_type
wild_type1$BMMTR = "Wild-type"

df_mt_wt <- rbind(mutant1, wild_type1)

df_final1 <- merge(df_final1, df_mt_wt[, c("SUBJID", "BMMTR")], by = "SUBJID", all.x = TRUE)

freq_mt_wt <- table(df_mt_wt$BMMTR)

barplot(freq_mt_wt, 
        main = "KRAS type",
        ylab = "Frequency",
        col = c("salmon", "yellowgreen"),
        ylim = c(0,500))

# Creare un grafico a barre con ggplot2
ggplot(df_mt_wt, aes(x = BMMTR, fill = BMMTR)) +
  geom_bar() +
  labs(title = "KRAS Type", x = "KRAS Type", y = "Frequency") +
  scale_fill_manual(values = c("salmon", "yellowgreen")) +
  theme_minimal()

# Calcolare le percentuali
df_mt_wt_percent <- df_mt_wt %>%
  count(BMMTR) %>%
  mutate(Percent = n / sum(n) * 100)

# Creare un grafico a barre in percentuale con ggplot2
ggplot(df_mt_wt_percent, aes(x = BMMTR, y = Percent, fill = BMMTR)) +
  geom_bar(stat = "identity") +
  labs(title = "KRAS Type distribution", x = "KRAS Type", y = "Percentage") +
  scale_fill_manual(values = c("salmon", "yellowgreen")) +
  theme_minimal() +
  geom_text(aes(label = paste0(round(Percent, 1), "%")), vjust = -0.5) +
  theme(plot.title = element_text(hjust = 0.5))

Adverse events

# Adverse events on all patients (regarless the therapy)
freq_AESOC <- round(((table(df_final$AESOC)) / nrow(df_final)) * 100, 1)
# Crea un dataframe con le frequenze percentuali
df_pie <- data.frame(categories = names(freq_AESOC), frequencies = freq_AESOC)
# Crea la pie chart con ggplot2
ggplot(df_pie, aes(x = "", y = freq_AESOC, fill = categories)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y", start = 0) +  # Imposta il tipo di coordinate su polare
labs(title = "Therapy Adverse Events Categories", 
    fill = "Categories", 
    x = NULL, 
    y = "Percentage") +  # Rimuovi l'etichetta dell'asse x
theme_minimal() +  # Stile del tema
scale_fill_manual(values = rainbow(length(freq_AESOC)))  # Imposta i colori manualmente
## Don't know how to automatically pick scale for object of type <table>.
## Defaulting to continuous.

# Adverse events based on the therapy

# FOLFOX4 alone
folfox <- subset(df_final, TRT == "FOLFOX alone")
freq_AESOC_folfox <- round(((table(folfox$AESOC)) / nrow(folfox)) * 100, 1)
# Crea un dataframe con le frequenze percentuali
df_pie_folfox <- data.frame(categories = names(freq_AESOC_folfox), frequencies = freq_AESOC_folfox)
# Crea la pie chart con ggplot2
ggplot(df_pie_folfox, aes(x = "", y = freq_AESOC_folfox, fill = categories)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y", start = 0) +  # Imposta il tipo di coordinate su polare
labs(title = "Folfox Adverse Events", 
    fill = "Categories", 
    x = NULL, 
    y = "Percentage") +  # Rimuovi l'etichetta dell'asse x
theme_minimal() +  # Stile del tema
scale_fill_manual(values = rainbow(length(freq_AESOC_folfox)))  # Imposta i colori manualmente
## Don't know how to automatically pick scale for object of type <table>.
## Defaulting to continuous.

# FOLFOX4 alone
folfox_panitumumab <- subset(df_final, TRT == "Panitumumab + FOLFOX")
freq_AESOC_folfox_panitumumab <- round(((table(folfox_panitumumab$AESOC)) / nrow(folfox_panitumumab)) * 100, 1)
# Crea un dataframe con le frequenze percentuali
df_pie_folfox_panitumumab <- data.frame(categories = names(freq_AESOC_folfox_panitumumab), frequencies = freq_AESOC_folfox_panitumumab)

# Crea la pie chart con ggplot2
ggplot(df_pie_folfox_panitumumab, aes(x = "", y = freq_AESOC_folfox_panitumumab, fill = categories)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y", start = 0) +  # Imposta il tipo di coordinate su polare
labs(title = "Folfox + Panitumumab Adverse Events", 
    fill = "Categories", 
    x = NULL, 
    y = "Percentage") +  # Rimuovi l'etichetta dell'asse x
theme_minimal() +  # Stile del tema
scale_fill_manual(values = rainbow(length(freq_AESOC_folfox_panitumumab)))  # Imposta i colori manualmente
## Don't know how to automatically pick scale for object of type <table>.
## Defaulting to continuous.

Frequencies Barplot of Adverse Events Grade

freq_AESEVCD <- table(df_final$AESEVCD)

df_bar <- data.frame(Grade = as.numeric(names(freq_AESEVCD)), Frequency = freq_AESEVCD)
# Crea il barplot con ggplot2
ggplot(df_bar, aes(x = factor(Grade), y = freq_AESEVCD, fill = factor(Grade))) +
geom_bar(stat = "identity") +
labs(title = "Adverse Events Grade Distribution", 
    x = "Grade (1-4)", 
    y = "Frequency") +
scale_fill_manual(name = 'Grade', values = c("chartreuse1", "yellow", "orange", "red")) +  # Colori delle barre
ylim(0, 1000) +  # Imposta i limiti sull'asse y
theme_minimal() +
theme(axis.text.x = element_text(angle = 0, hjust = 1))

SURVIVAL ANALYSIS

The primary objective of this study was to assess the treatment effect of the addition of panitumumab to FOLFOX4 on PFS as initial therapy for mCRC in patients with WT KRAS tumors and also in patients with MT KRAS tumors. The study was amended to compare PFS (primary end point) with OS (secondary end point) according to KRAS status before any efficacy analyses were carried out.

Treatment effects on PFS and OS were estimated using stratified Cox proportional hazards models and the Kaplan–Meier method. An exact 95% confidence interval (CI) was estimated for a stratified odds ratio for objective response rate (ORR). The randomization stratification factors were used for the stratified analyses.

# Calcolare la mediana del tempo di sopravvivenza per tutti i pazienti (PFS)
fit_pfs <- survfit(Surv(PFSDYCR, PFSCR) ~ TRT, data = df_mt_wt)
summary(fit_pfs)$table
##                          records n.max n.start events    rmean se(rmean) median
## TRT=FOLFOX alone             411   411     411    361 346.0519  15.01651    273
## TRT=Panitumumab + FOLFOX     423   423     423    366 365.2060  16.04395    270
##                          0.95LCL 0.95UCL
## TRT=FOLFOX alone             238     290
## TRT=Panitumumab + FOLFOX     231     285
median_pfs <- summary(fit_pfs)$table
# Convertire la mediana da giorni a mesi
median_pfs_months <- round(median_pfs[,"median"] / 30.44, 1)

# Calcolare la mediana del tempo di sopravvivenza per tutti i pazienti (OS)
fit_os <- survfit(Surv(DTHDY, DTH) ~ TRT, data = df_mt_wt)
median_os <- summary(fit_os)$table
summary(fit_os)$table
##                          records n.max n.start events    rmean se(rmean) median
## TRT=FOLFOX alone             411   411     411    310 696.3368  21.81274    600
## TRT=Panitumumab + FOLFOX     423   423     423    307 695.3658  22.76780    602
##                          0.95LCL 0.95UCL
## TRT=FOLFOX alone             536     673
## TRT=Panitumumab + FOLFOX     546     672
# Convertire la mediana da giorni a mesi
median_os_months <- round(median_os[,"median"] / 30.44, 1)

# Eseguire il log-rank test per PFS
survdiff_pfs <- survdiff(Surv(PFSDYCR, PFSCR) ~ TRT, data = df_mt_wt)
print(survdiff_pfs)
## Call:
## survdiff(formula = Surv(PFSDYCR, PFSCR) ~ TRT, data = df_mt_wt)
## 
##                            N Observed Expected (O-E)^2/E (O-E)^2/V
## TRT=FOLFOX alone         411      361      351     0.282      0.55
## TRT=Panitumumab + FOLFOX 423      366      376     0.264      0.55
## 
##  Chisq= 0.6  on 1 degrees of freedom, p= 0.5
# Eseguire il log-rank test per OS
survdiff_os <- survdiff(Surv(DTHDY, DTH) ~ TRT, data = df_mt_wt)
print(survdiff_os)
## Call:
## survdiff(formula = Surv(DTHDY, DTH) ~ TRT, data = df_mt_wt)
## 
##                            N Observed Expected (O-E)^2/E (O-E)^2/V
## TRT=FOLFOX alone         411      310      305    0.0698     0.139
## TRT=Panitumumab + FOLFOX 423      307      312    0.0684     0.139
## 
##  Chisq= 0.1  on 1 degrees of freedom, p= 0.7
# Visualizzare la curva di sopravvivenza con ggsurvfit per PFS
p1 <- ggsurvplot(fit_pfs, data = df_mt_wt, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC PFS by treatment",
           xlab = "Follow-up time (days)", ylab = "Progression Free Survival (PFS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4", "Panitumumab + FOLFOX4"),
           legend.title = "Treatment")

# Aggiungi la mediana al plot PFS
p1$plot <- p1$plot + 
  geom_vline(xintercept = median_pfs[,"median"], linetype = "dashed", color = "red")

# Visualizzare la curva di sopravvivenza con ggsurvfit per OS
p2 <- ggsurvplot(fit_os, data = df_mt_wt, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC OS by treatment",
           xlab = "Follow-up time (days)", ylab = "Overall Survival (OS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4", "Panitumumab + FOLFOX4"),
           legend.title = "Treatment")

# Aggiungi la mediana al plot OS
p2$plot <- p2$plot + 
  geom_vline(xintercept = median_os[,"median"], linetype = "dashed", color = "red")

# Creare una tabella con le mediane di sopravvivenza in mesi
median_table <- data.frame(
  Treatment = rep(c("FOLFOX4", "Panitumumab + FOLFOX4"), 2),
  Survival_Type = c("PFS", "PFS", "OS", "OS"),
  Median_Survival_Months = c(median_pfs_months, median_os_months)
)

# Creare un grafico della tabella delle mediane
table_plot <- tableGrob(median_table, rows = NULL)

# Disporre i plot in un unico layout con la tabella a lato
grid.arrange(p1$plot, p2$plot, table_plot, ncol = 1, nrow = 3, heights = c(2, 2, 1))

# Creare la curva di sopravvivenza di Kaplan-Meier per PFS (wild-type)
fit_pfs <- survfit(Surv(PFSDYCR, PFSCR) ~ TRT, data = wild_type1)
summary(fit_pfs)$table
##                          records n.max n.start events    rmean se(rmean) median
## TRT=FOLFOX alone             169   169     169    145 351.7932   22.0711    285
## TRT=Panitumumab + FOLFOX     181   181     181    142 479.9806   28.9404    341
##                          0.95LCL 0.95UCL
## TRT=FOLFOX alone             237     337
## TRT=Panitumumab + FOLFOX     287     424
median_pfs <- summary(fit_pfs)$table
median_pfs
##                          records n.max n.start events    rmean se(rmean) median
## TRT=FOLFOX alone             169   169     169    145 351.7932   22.0711    285
## TRT=Panitumumab + FOLFOX     181   181     181    142 479.9806   28.9404    341
##                          0.95LCL 0.95UCL
## TRT=FOLFOX alone             237     337
## TRT=Panitumumab + FOLFOX     287     424
median_pfs_months <- round(median_pfs[,"median"] / 30.44, 1)
median_pfs_months
##         TRT=FOLFOX alone TRT=Panitumumab + FOLFOX 
##                      9.4                     11.2
# Visualizzare la curva di sopravvivenza con ggsurvfit
p1 <- ggsurvplot(fit_pfs, data = wild_type1, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC KRAS WT PFS by treatment",
           xlab = "Follow-up time (days)", ylab = "Progression Free Survival (PFS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4", "Panitumumab + FOLFOX4"),
           legend.title = "Treatment") 

# Aggiungi la mediana al plot PFS
p1$plot <- p1$plot + 
  geom_vline(xintercept = median_pfs[,"median"], linetype = "dashed", color = "red")

# Creare la curva di sopravvivenza di Kaplan-Meier per PFS (mutant)
fit_pfs_mutant <- survfit(Surv(PFSDYCR, PFSCR) ~ TRT, data = mutant1)
median_pfs_mutant <- summary(fit_pfs_mutant)$table
median_pfs_mutant_months <- round(median_pfs_mutant[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p2 <- ggsurvplot(fit_pfs_mutant, data = mutant1, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC KRAS MT PFS by treatment",
           xlab = "Follow-up time (days)", ylab = "Progression Free Survival (PFS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4","Panitumumab + FOLFOX4"),
           legend.title = "Treatment")

# Aggiungi la mediana al plot OS
p2$plot <- p2$plot + 
  geom_vline(xintercept = median_pfs_mutant[,"median"], linetype = "dashed", color = "red")

# Creare la curva di sopravvivenza di Kaplan-Meier per OS (wild-type)
fit_os <- survfit(Surv(DTHDY, DTH) ~ TRT, data = wild_type1)
median_os <- summary(fit_os)$table
median_os_months <- round(median_os[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p3 <- ggsurvplot(fit_os, data = wild_type1, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC KRAS WT OS by treatment",
           xlab = "Follow-up time (days)", ylab = "Overall Survival (OS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4", "Panitumumab + FOLFOX4"),
           legend.title = "Treatment")

# Aggiungi la mediana al plot OS
p3$plot <- p3$plot + 
  geom_vline(xintercept = median_os[,"median"], linetype = "dashed", color = "red")

# Creare la curva di sopravvivenza di Kaplan-Meier per OS (mutant)
fit_os_mutant <- survfit(Surv(DTHDY, DTH) ~ TRT, data = mutant1)
median_os_mutant <- summary(fit_os_mutant)$table
median_os_mutant_months <- round(median_os_mutant[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p4 <- ggsurvplot(fit_os_mutant, data = mutant1, pval = TRUE, risk.table = TRUE,
           title = "Kaplan-Meier Curve for mCRC KRAS MT OS by treatment",
           xlab = "Follow-up time (days)", ylab = "Overall Survival (OS)",
           palette = c("blue", "gold2"),
           conf.int = TRUE,
           legend.labs = c("FOLFOX4", "Panitumumab + FOLFOX4"),
           legend.title = "Treatment")

# Aggiungi la mediana al plot OS
p4$plot <- p4$plot + 
  geom_vline(xintercept = median_os_mutant[,"median"], linetype = "dashed", color = "red")

# Creare una tabella con le mediane di sopravvivenza
median_table <- data.frame(
  Treatment = rep(c("FOLFOX4", "Panitumumab + FOLFOX4"), 4),
  Survival_Type = c("PFS-WT", "PFS-WT", "PFS-MT", "PFS-MT", "OS-WT", "OS-WT", "OS-MT", "OS-MT"),
  Median_Survival_Months = c(median_pfs_months, median_pfs_mutant_months, median_os_months, median_os_mutant_months)
)

# Creare un grafico della tabella delle mediane
table_plot <- tableGrob(median_table, rows = NULL)

# Disporre i plot in un unico layout
grid.arrange(p1$plot, p2$plot, p3$plot, p4$plot, ncol = 2, nrow = 2)

grid.arrange(table_plot, ncol=1, nrow = 1)

Skin toxicity was defined as any treatment-emergent adverse event indicative of a skin disorder, representing a composite category of adverse event terms including but not limited to rash, dermatitis acneiform, pruritus, dry skin, skin fissures, and erythema. Retrospective post-hoc analyses were carried out to determine the effect of skin toxicity on efficacy end points, in- cluding PFS (by central review) and OS. A stratified Cox proportional hazards model was used to examine the relationship between worst grade skin toxicity severity (grade 2–4: grade 0–1) and PFS/OS. ORR by central review and worst grade skin toxicity was provided.

 # Creare una variabile AE per indicare il grado di AE
df_final1$AE <- ifelse(df_final1$AESEVCD %in% c(1, 2), "Low", "High")

# Creare la curva di sopravvivenza di Kaplan-Meier per PFS in base al grado di AE
fit_pfs_skin <- survfit(Surv(PFSDYCR, PFSCR) ~ AE, data = df_final1)
summary(fit_pfs_skin)$table
##         records n.max n.start events    rmean se(rmean) median 0.95LCL 0.95UCL
## AE=High     175   175     175    145 420.9534  28.19621    287     268     328
## AE=Low      433   433     433    383 353.1835  14.66345    275     234     290
median_pfs_skin <- summary(fit_pfs_skin)$table
median_pfs_months_skin <- round(median_pfs_skin[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p1 <- ggsurvplot(fit_pfs_skin, data = df_final1, pval = TRUE, risk.table = TRUE,
                 title = "Kaplan-Meier Curve for mCRC PFS by AE grade",
                 xlab = "Follow-up time (days)", ylab = "Progression Free Survival (PFS)",
                 palette = c("blue", "gold2"),
                 conf.int = TRUE,
                 legend.labs = c("High", "Low"),
                 legend.title = "Adverse Events Grade") 

# Aggiungi la mediana al plot PFS
p1$plot <- p1$plot + 
  geom_vline(xintercept = median_pfs_skin[,"median"], linetype = "dashed", color = "red")

# Creare la curva di sopravvivenza di Kaplan-Meier per OS in base al grado di AE
fit_os_skin <- survfit(Surv(DTHDY, DTH) ~ AE, data = df_final1)
median_os_skin <- summary(fit_os_skin)$table
median_os_months_skin <- round(median_os_skin[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p2 <- ggsurvplot(fit_os_skin, data = df_final1, pval = TRUE, risk.table = TRUE,
                 title = "Kaplan-Meier Curve for mCRC OS by AE grade",
                 xlab = "Follow-up time (days)", ylab = "Overall Survival (OS)",
                 palette = c("blue", "gold2"),
                 conf.int = TRUE,
                 legend.labs = c("High", "Low"),
                 legend.title = "Adverse Events Grade")

# Aggiungi la mediana al plot OS
p2$plot <- p2$plot + 
  geom_vline(xintercept = median_os_skin[,"median"], linetype = "dashed", color = "red")

# Creare una tabella con le mediane di sopravvivenza
median_table_skin <- data.frame(
  AE_Grade = rep(c("High", "Low"), 2),
  Survival_Type = c("PFS", "PFS", "OS", "OS"),
  Median_Survival_Months = c(median_pfs_months_skin, median_os_months_skin)
)

# Creare un grafico della tabella delle mediane
table_plot <- tableGrob(median_table_skin, rows = NULL)

# Disporre i plot in un unico layout
grid.arrange(p1$plot, p2$plot, ncol = 1)

grid.arrange(table_plot, ncol = 1, nrow = 1)

 # Creare la curva di sopravvivenza di Kaplan-Meier per PFS in base a ECOC
fit_pfs_ecog <- survfit(Surv(PFSDYCR, PFSCR) ~ B_ECOG, data = df_final1)
median_pfs_ecog <- summary(fit_pfs_ecog)$table
median_pfs_months_ecog <- round(median_pfs_ecog[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p1 <- ggsurvplot(fit_pfs_ecog, data = df_final1, pval = TRUE, risk.table = TRUE,
                 title = "Kaplan-Meier Curve for mCRC PFS by Baseline ECOG",
                 xlab = "Follow-up time (days)", ylab = "Progression Free Survival (PFS)",
                 palette = c("blue", "gold2", "olivedrab4"),
                 conf.int = TRUE,
                 legend.labs = c("Fully Active", "In bed less than 50% of the time", "Symptoms but ambulatory"),
                 legend.title = "Baseline ECOG") 

# Aggiungi la mediana al plot PFS
p1$plot <- p1$plot + 
  geom_vline(xintercept = median_pfs_ecog[,"median"], linetype = "dashed", color = "red")

# Creare la curva di sopravvivenza di Kaplan-Meier per OS in base a ECOG
fit_os_ecog <- survfit(Surv(DTHDY, DTH) ~ B_ECOG, data = df_final1)
median_os_ecog <- summary(fit_os_ecog)$table
median_os_months_ecog <- round(median_os_ecog[,"median"] / 30.44, 1)

# Visualizzare la curva di sopravvivenza con ggsurvfit
p2 <- ggsurvplot(fit_os_ecog, data = df_final1, pval = TRUE, risk.table = TRUE,
                 title = "Kaplan-Meier Curve for mCRC OS by Baseline ECOG",
                 xlab = "Follow-up time (days)", ylab = "Overall Survival (OS)",
                 palette = c("blue", "gold2", "olivedrab4"),
                 conf.int = TRUE,
                 legend.labs = c("Fully Active", "In bed less than 50% of the time", "Symptoms but ambulatory"),
                 legend.title = "Baseline ECOG")

# Aggiungi la mediana al plot OS
p2$plot <- p2$plot + 
  geom_vline(xintercept = median_os_ecog[,"median"], linetype = "dashed", color = "red")

# Creare una tabella con le mediane di sopravvivenza
median_table_ecog <- data.frame(
  ECOG = rep(c("Fully Active", "In bed less than 50% of the time", "Symptoms but ambulatory"), 2),
  Survival_Type = c("PFS", "PFS","PFS","OS", "OS", "OS"),
  Median_Survival_Months = c(median_pfs_months_ecog, median_os_months_ecog)
)

# Creare un grafico della tabella delle mediane
table_plot <- tableGrob(median_table_ecog, rows = NULL)

# Disporre i plot in un unico layout
grid.arrange(p1$plot, p2$plot, ncol = 1)

grid.arrange(table_plot, ncol = 1, nrow = 1)

hist(df_final1$AGE, xlab='Age [years]', main='Age distribution', col = "skyblue")
summary(df_final1$AGE)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   27.00   54.00   61.00   60.74   69.00   83.00
df_final1$agecat61 <- cut(df_final1$AGE, breaks=c(0, 55, Inf), labels=c("young", "old"))
head(df_final1)
##   SUBJID                 AEPT                                  AESOC AESTDYI
## 1      3 Dermatitis acneiform Skin and subcutaneous tissue disorders      12
## 2      7         Folliculitis            Infections and infestations     245
## 3      9     Exfoliative rash Skin and subcutaneous tissue disorders       3
## 4     13                 Rash Skin and subcutaneous tissue disorders      37
## 5     15             Dry skin Skin and subcutaneous tissue disorders      72
## 6     17            Urticaria Skin and subcutaneous tissue disorders     175
##   AEENDYI AESEVCD AECONT                  TRT                 ATRT PRSURG DTHDY
## 1      29       1      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   301
## 2     252       3      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   339
## 3      14       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   696
## 4      55       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   197
## 5     157       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   414
## 6     175       2      N         FOLFOX alone         FOLFOX alone      Y  1043
##   DTH PFSDYCR PFSCR LIVERMET             DIAGMONS AGE    SEX B_WEIGHT B_HEIGHT
## 1   1      54     1        Y 1.86885245901639E+16  65   Male       67      165
## 2   1     339     1        Y 2.78688524590163E+16  52 Female       53      160
## 3   1     301     1        Y     7570491803278680  42 Female       69      148
## 4   1      73     1        Y  2.0327868852459E+16  54   Male       62      160
## 5   1     414     1        Y      380327868852459  67 Female     48.8      155
## 6   0     226     0        Y 3.14754098360655E+16  57 Female     63.6      149
##                 RACE                  B_ECOG    HISSUBTY B_METANM DIAGTYPE
## 1 White or Caucasian Symptoms but ambulatory No sub-type        5    Colon
## 2 Hispanic or Latino Symptoms but ambulatory    Mucinous        3    Colon
## 3 Hispanic or Latino            Fully active No sub-type        4    Colon
## 4 White or Caucasian            Fully active       Other        6   Rectal
## 5 White or Caucasian Symptoms but ambulatory No sub-type        2   Rectal
## 6 White or Caucasian            Fully active No sub-type        2    Colon
##                BMMTNM1    BMMTR1           BMMTNM2    BMMTR2
## 1 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 2 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 3 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 4 KRAS exon 2 (c12/13) Wild-type KRAS exon 3 (c61) Wild-type
## 5 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 6 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
##                  BMMTNM3 BMMTR3              BMMTNM4    BMMTR4
## 1 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 2 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 3 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 4 KRAS exon 4 (c117/146) Mutant NRAS exon 2 (c12/13) Wild-type
## 5 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 6 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
##             BMMTNM5    BMMTR5                BMMTNM6    BMMTR6
## 1 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 2 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 3 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 4 NRAS exon 3 (c61) Wild-type NRAS exon 4 (c117/146) Wild-type
## 5 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 6 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
##               BMMTNM7    BMMTR7              BMMTNM15   BMMTR15
## 1 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 2 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 3 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 4 BRAF exon 15 (c600) Wild-type KRAS exon 3 (c59/c61) Wild-type
## 5 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 6 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
##                BMMTNM16   BMMTR16  BMMTR   AE agecat61
## 1 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old
## 2 NRAS exon 3 (c59/c61)      <NA> Mutant High    young
## 3 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low    young
## 4 NRAS exon 3 (c59/c61) Wild-type Mutant  Low    young
## 5 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low      old
## 6 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old
fit.age <- survfit(Surv(PFSDYCR, PFSCR) ~ agecat61, data=df_final1)
print(fit.age)
## Call: survfit(formula = Surv(PFSDYCR, PFSCR) ~ agecat61, data = df_final1)
## 
##                  n events median 0.95LCL 0.95UCL
## agecat61=young 178    152    285     234     332
## agecat61=old   430    376    280     238     290
survfit2(Surv(PFSDYCR, PFSCR) ~ agecat61, data=df_final1) %>%
  ggsurvfit(linewidth = 1) +
  add_confidence_interval() + # add confidence interval
  add_risktable() + # Add risk table
  add_quantile(y_value = 0.5, color = "gray50", linewidth = 0.75)+  # Specify median survival
  labs(title = "Kaplan-Meier Curve by age group",
       x="Follow-up time (days)")+
  scale_x_continuous(breaks = seq(0,1000,by=90))+
  scale_color_discrete("Age group")+
  scale_fill_discrete("Age group")+
  add_pvalue("annotation",x=500)

df_final1$BMI <- with(df_final1, (as.numeric(B_WEIGHT) / as.numeric(B_HEIGHT)^2) * 10000) 

df_final1$BMI_category <- cut(df_final1$BMI, breaks=c(0, 18.5, 30, Inf), labels=c("underweight", "normal weight", "overweight/obese"))
head(df_final1)
##   SUBJID                 AEPT                                  AESOC AESTDYI
## 1      3 Dermatitis acneiform Skin and subcutaneous tissue disorders      12
## 2      7         Folliculitis            Infections and infestations     245
## 3      9     Exfoliative rash Skin and subcutaneous tissue disorders       3
## 4     13                 Rash Skin and subcutaneous tissue disorders      37
## 5     15             Dry skin Skin and subcutaneous tissue disorders      72
## 6     17            Urticaria Skin and subcutaneous tissue disorders     175
##   AEENDYI AESEVCD AECONT                  TRT                 ATRT PRSURG DTHDY
## 1      29       1      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   301
## 2     252       3      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   339
## 3      14       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   696
## 4      55       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   197
## 5     157       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   414
## 6     175       2      N         FOLFOX alone         FOLFOX alone      Y  1043
##   DTH PFSDYCR PFSCR LIVERMET             DIAGMONS AGE    SEX B_WEIGHT B_HEIGHT
## 1   1      54     1        Y 1.86885245901639E+16  65   Male       67      165
## 2   1     339     1        Y 2.78688524590163E+16  52 Female       53      160
## 3   1     301     1        Y     7570491803278680  42 Female       69      148
## 4   1      73     1        Y  2.0327868852459E+16  54   Male       62      160
## 5   1     414     1        Y      380327868852459  67 Female     48.8      155
## 6   0     226     0        Y 3.14754098360655E+16  57 Female     63.6      149
##                 RACE                  B_ECOG    HISSUBTY B_METANM DIAGTYPE
## 1 White or Caucasian Symptoms but ambulatory No sub-type        5    Colon
## 2 Hispanic or Latino Symptoms but ambulatory    Mucinous        3    Colon
## 3 Hispanic or Latino            Fully active No sub-type        4    Colon
## 4 White or Caucasian            Fully active       Other        6   Rectal
## 5 White or Caucasian Symptoms but ambulatory No sub-type        2   Rectal
## 6 White or Caucasian            Fully active No sub-type        2    Colon
##                BMMTNM1    BMMTR1           BMMTNM2    BMMTR2
## 1 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 2 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 3 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 4 KRAS exon 2 (c12/13) Wild-type KRAS exon 3 (c61) Wild-type
## 5 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 6 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
##                  BMMTNM3 BMMTR3              BMMTNM4    BMMTR4
## 1 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 2 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 3 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 4 KRAS exon 4 (c117/146) Mutant NRAS exon 2 (c12/13) Wild-type
## 5 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 6 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
##             BMMTNM5    BMMTR5                BMMTNM6    BMMTR6
## 1 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 2 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 3 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 4 NRAS exon 3 (c61) Wild-type NRAS exon 4 (c117/146) Wild-type
## 5 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 6 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
##               BMMTNM7    BMMTR7              BMMTNM15   BMMTR15
## 1 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 2 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 3 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 4 BRAF exon 15 (c600) Wild-type KRAS exon 3 (c59/c61) Wild-type
## 5 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 6 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
##                BMMTNM16   BMMTR16  BMMTR   AE agecat61      BMI
## 1 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old 24.60973
## 2 NRAS exon 3 (c59/c61)      <NA> Mutant High    young 20.70312
## 3 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low    young 31.50110
## 4 NRAS exon 3 (c59/c61) Wild-type Mutant  Low    young 24.21875
## 5 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low      old 20.31217
## 6 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old 28.64736
##       BMI_category
## 1    normal weight
## 2    normal weight
## 3 overweight/obese
## 4    normal weight
## 5    normal weight
## 6    normal weight
fit.age <- survfit(Surv(PFSDYCR, PFSCR) ~ BMI_category, data=df_final1)
print(fit.age)
## Call: survfit(formula = Surv(PFSDYCR, PFSCR) ~ BMI_category, data = df_final1)
## 
##    1 observation deleted due to missingness 
##                                 n events median 0.95LCL 0.95UCL
## BMI_category=underweight       23     21    281     180     405
## BMI_category=normal weight    476    415    278     240     292
## BMI_category=overweight/obese 108     91    285     234     328
survfit2(Surv(PFSDYCR, PFSCR) ~ BMI_category, data=df_final1) %>%
  ggsurvfit(linewidth = 1) +
  add_confidence_interval() + # add confidence interval
  add_risktable() + # Add risk table
  add_quantile(y_value = 0.5, color = "gray50", linewidth = 0.75)+  # Specify median survival
  labs(title = "Kaplan-Meier Curve by BMI group",
       x="Follow-up time (days)")+
  scale_x_continuous(breaks = seq(0,1000,by=90))+
  scale_color_discrete("BMI group")+
  scale_fill_discrete("BMI group")+
  add_pvalue("annotation",x=500)

# Add variables in a binary form
df_final1$race <- ifelse(df_final1$RACE %in% "White or Caucasian", "White or Caucasian", "Other")

df_final1$ECOG <- ifelse(df_final1$B_ECOG %in% c("Fully active", "Symptoms but ambulatory"), "0 or 1", "2")

df_final1$Primary_Tumor <- ifelse(df_final1$DIAGTYPE %in% "Colon", "Colon", "Rectal")

df_final1$Number_of_sites <- ifelse(df_final1$B_METANM == "1", "1",
                             ifelse(df_final1$B_METANM == "2", "2",
                             ifelse(df_final1$B_METANM >= 3, ">3", as.character(df_final1$B_METANM))))

df_final1$Location_of_site <- ifelse(df_final1$LIVERMET == "Y", "Liver", "Other")

colon2.1 <- subset(colon2, colon2$LBBASE == "Y" & colon2$LBTEST == "Lactate Dehydrogenase")

df_final1 <- merge(df_final1, colon2.1, by.x = "SUBJID" )

df_final1 <- df_final1[, !names(df_final1) %in% c("LBTEST", "LBBASE", "VISITDY", "VISIT", "LBSTUNIT")]

df_final1$Baseline_LDH1 <- ifelse(df_final1$LBSTRESN < 150 * 1.5, "< 1.5 x ULN",
                             ifelse(df_final1$LBSTRESN >= 150 * 1.5, ">= 1.5 x ULN", "other"))

df_final1$Baseline_LDH2 <- ifelse(df_final1$LBSTRESN < 200 * 2, "< 2 x ULN",
                             ifelse(df_final1$LBSTRESN >= 200 * 2, ">= 2 x ULN", "other"))

df_final1$Age1 <- ifelse(df_final1$AGE < 65, "<65", ">=65")
df_final1$Age2 <- ifelse(df_final1$AGE < 75, "<75", ">=75")
# Caricare le librerie necessarie
library(survival)
library(dplyr)
library(forestplot)
## Loading required package: grid
## Loading required package: checkmate
## Loading required package: abind
library(survminer)

# Creare un nuovo dataframe con tutte le variabili di interesse
df_final1 <- df_final1 %>%
  mutate(AE = ifelse(AESEVCD %in% c(1, 2), "Low", "High"),
         TRT = ifelse(TRT %in% "Panitumumab + FOLFOX", "Panitumumab + FOLFOX", "FOLFOX alone"),
         race = ifelse(RACE %in% "White or Caucasian", "White or Caucasian", "Other"),
         ECOG = ifelse(B_ECOG %in% c("Fully active", "Symptoms but ambulatory"), "0 or 1", "2"),
         Primary_Tumor = ifelse(DIAGTYPE %in% "Colon", "Colon", "Rectal"),
         Number_of_sites = ifelse(B_METANM == 1, "1",
                                  ifelse(B_METANM == 2, "2",
                                         ifelse(B_METANM >= 3, ">3", as.character(B_METANM)))),
         Location_of_site = ifelse(LIVERMET == "Y", "Liver", "Other"),
         Baseline_LDH1 = ifelse(LBSTRESN < 150 * 1.5, "< 1.5 x ULN",
                                ifelse(LBSTRESN >= 150 * 1.5, ">= 1.5 x ULN", "other")),
         Baseline_LDH2 = ifelse(LBSTRESN < 200 * 2, "< 2 x ULN",
                                ifelse(LBSTRESN >= 200 * 2, ">= 2 x ULN", "other")),
         Age1 = ifelse(AGE < 65, "<65", ">=65"),
         Age2 = ifelse(AGE < 75, "<75", ">=75"))

# Modello Cox PH con tutte le variabili
cox_model_full <- coxph(Surv(PFSDYCR, PFSCR) ~ AE + TRT + BMMTR + ECOG + Primary_Tumor + Number_of_sites + 
                          Location_of_site + Baseline_LDH1 + Baseline_LDH2 + Age1 + 
                          Age2 + SEX + race + PRSURG + BMI_category, data = df_final1)

# Sommario del modello
summary_cox_full <- summary(cox_model_full)

coef(cox_model_full)
##                        AELow      TRTPanitumumab + FOLFOX 
##                  0.204926907                  0.097100453 
##               BMMTRWild-type                        ECOG2 
##                 -0.533819479                  0.638106125 
##          Primary_TumorRectal             Number_of_sites1 
##                 -0.070345847                 -0.172415606 
##             Number_of_sites2        Location_of_siteOther 
##                 -0.128567807                  0.165999001 
##    Baseline_LDH1>= 1.5 x ULN      Baseline_LDH2>= 2 x ULN 
##                  0.278796683                  0.017285654 
##                     Age1>=65                     Age2>=75 
##                  0.091164105                  0.101553984 
##                      SEXMale       raceWhite or Caucasian 
##                 -0.008261972                  0.065201933 
##                      PRSURGY    BMI_categorynormal weight 
##                 -0.305881121                  0.120761565 
## BMI_categoryoverweight/obese 
##                  0.018181541
# Estrarre i risultati
HR <- exp(coef(cox_model_full))  # Hazard Ratios
CI_lower <- exp(confint(cox_model_full)[, 1])  # Limiti inferiori degli intervalli di confidenza
CI_upper <- exp(confint(cox_model_full)[, 2])  # Limiti superiori degli intervalli di confidenza
p_values <- summary_cox_full$coefficients[, 5]  # P-values

# Preparazione dei dati per il forest plot
forest_data <- data.frame(
  Factors = c("Adverse Events Grade: Low", "Treatment: Panitumumab + FOLFOX", "KRAS: WT", "ECOG: 2 (in bed < 50% of the time)", "Primary Tumor: Rectal", "Number of Sites: 1", "Number of Sites: 2", "Location of Site: Other", "Baseline LDH1: >= 1.5 x ULN", "Baseline LDH2: >= 2 x ULN", "Age: >=65", "Age: >=75", "Sex: Male", "Race: White or Caucasian", "Pre-Surgery", "BMI category: normal weight", "BMI category: overweight/obese"),
  HR = round(HR, 2),
  Lower_CI = round(CI_lower, 2),
  Upper_CI = round(CI_upper, 2),
  p_value = round(p_values, 3)
)

# Preparare i testi da mostrare nella colonna di etichette
labels <- cbind(
  forest_data$Factors,
  sprintf("%.2f", forest_data$HR),
  sprintf("%.2f - %.2f", forest_data$Lower_CI, forest_data$Upper_CI)
)

# Creare il forest plot
forestplot(
  labeltext = labels,
  mean = forest_data$HR,
  lower = forest_data$Lower_CI,
  upper = forest_data$Upper_CI,
  title = "Hazard Ratio by Factor",
  xlab = "Hazard Ratio (log scale)",
  is.summary = rep(FALSE, nrow(forest_data)),
  xlog = TRUE,
  xticks = c(0.5, 1, 2),
  col = forestplot::fpColors(box = "blue", line = "darkblue", summary = "royalblue"),
  lineheight = unit(1.5, "cm"),  # Aumenta l'altezza delle linee per spazio tra le etichette
  boxsize = 0.2,  # Riduci la dimensione dei box per una visualizzazione più pulita
  clip = c(0.5, 2.5),  # Limita l'asse x per una visualizzazione più chiara
  txt_gp = fpTxtGp(
    label = gpar(fontsize = 10),  # Aumenta la dimensione del testo per le etichette
    xlab = gpar(fontsize = 12),  # Aumenta la dimensione del testo per l'asse x
    title = gpar(fontsize = 14, fontface = "bold")  # Aumenta la dimensione del titolo
  )
)

library(survival)
library(survminer)
library(cowplot)
## 
## Attaching package: 'cowplot'
## The following object is masked from 'package:ggpubr':
## 
##     get_legend
# Fit the Cox PH model
cox_model <- coxph(Surv(PFSDYCR, PFSCR) ~ AE + TRT + BMMTR + ECOG + Primary_Tumor + Number_of_sites + 
                          Location_of_site + Baseline_LDH1 + Baseline_LDH2 + Age1 + 
                          Age2 + SEX + race + PRSURG + BMI_category, data = df_final1)

# Schoenfeld Residuals Test
cox_zph <- cox.zph(cox_model)

# Print the test results
print(cox_zph)
##                   chisq df    p
## AE                0.434  1 0.51
## TRT               0.204  1 0.65
## BMMTR             0.481  1 0.49
## ECOG              2.312  1 0.13
## Primary_Tumor     1.490  1 0.22
## Number_of_sites   1.494  2 0.47
## Location_of_site  1.018  1 0.31
## Baseline_LDH1     0.917  1 0.34
## Baseline_LDH2     1.063  1 0.30
## Age1              0.175  1 0.68
## Age2              0.174  1 0.68
## SEX               0.105  1 0.75
## race              0.701  1 0.40
## PRSURG            0.261  1 0.61
## BMI_category      0.322  2 0.85
## GLOBAL           10.417 17 0.89
# Plot diagnostics

# Schoenfeld Residuals Test
cox_zph <- cox.zph(cox_model_full)

# Generate diagnostic plots
ggcoxzph(cox_zph)

plot_dfbeta <- ggcoxdiagnostics(cox_model_full, type = "dfbeta", linear.predictions = FALSE, ggtheme = theme_bw())
## Warning: `gather_()` was deprecated in tidyr 1.2.0.
## ℹ Please use `gather()` instead.
## ℹ The deprecated feature was likely used in the survminer package.
##   Please report the issue at <https://github.com/kassambara/survminer/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
plot_deviance <- ggcoxdiagnostics(cox_model_full, type = "deviance", linear.predictions = FALSE, ggtheme = theme_bw())
plot_martingale <- ggcoxdiagnostics(cox_model_full, type = "martingale", linear.predictions = FALSE, ggtheme = theme_bw())

plot_dfbeta
## `geom_smooth()` using formula = 'y ~ x'

# Combine the plots into a single grid
combined_plot <- plot_grid(plot_deviance, plot_martingale, labels = "AUTO")
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
# Display the combined plot
print(combined_plot)

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ lubridate 1.9.3     ✔ tibble    3.2.1
## ✔ purrr     1.0.2     ✔ tidyr     1.3.1
## ✔ readr     2.1.5     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ gridExtra::combine() masks dplyr::combine()
## ✖ tidyr::extract()     masks magrittr::extract()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ purrr::set_names()   masks magrittr::set_names()
## ✖ lubridate::stamp()   masks cowplot::stamp()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Filtrare solo i dati di screening
colon_screening <- colon2 %>% filter(VISIT == "Screening")
# Riorganizzare i dati con pivot_wider
colon_wide <- colon_screening %>%
  select(SUBJID, LBTEST, LBSTRESN) %>%
  pivot_wider(names_from = LBTEST, values_from = LBSTRESN)
## Warning: Values from `LBSTRESN` are not uniquely identified; output will contain
## list-cols.
## • Use `values_fn = list` to suppress this warning.
## • Use `values_fn = {summary_fun}` to summarise duplicates.
## • Use the following dplyr code to identify duplicates.
##   {data} |>
##   dplyr::summarise(n = dplyr::n(), .by = c(SUBJID, LBTEST)) |>
##   dplyr::filter(n > 1L)
# Convertire i dati in numerico
colon_wide <- colon_wide %>%
  mutate(across(everything(), ~ as.numeric(as.character(.))))
## Warning: There were 8 warnings in `mutate()`.
## The first warning was:
## ℹ In argument: `across(everything(), ~as.numeric(as.character(.)))`.
## Caused by warning:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 7 remaining warnings.
# Rimuovere le righe con NA
colon_wide <- colon_wide %>% drop_na()

# Identificare i duplicati
colon_wide <- colon_wide %>% distinct(SUBJID, .keep_all = TRUE)

# Controllare i dati riorganizzati
print(colon_wide)
## # A tibble: 816 × 9
##    SUBJID Albumin `Alkaline Phosphatase` Creatinine `Lactate Dehydrogenase`
##     <dbl>   <dbl>                  <dbl>      <dbl>                   <dbl>
##  1      3 3.55e 1                    314    9.99e15                     271
##  2      4 3.4 e 1                    475    6.8 e 1                     666
##  3      5 3.7 e 1                    122    6.19e 1                     381
##  4      6 2.88e 1                    100    7.25e 4                     249
##  5      7 3.4 e 1                    253    7.07e 1                     309
##  6      9 5.1 e 1                    110    6.19e 1                     214
##  7     10 3.8 e 1                    313    8.04e 4                     749
##  8     13 4.19e 1                    308    4.42e 1                     308
##  9     15 4.4 e 1                     61    7.07e 1                     280
## 10     16 3.78e16                    365    9.28e15                    1131
## # ℹ 806 more rows
## # ℹ 4 more variables: Hemoglobin <dbl>, Platelets <dbl>,
## #   `White Blood Cells` <dbl>, `Carcinoembryonic Antigen` <dbl>
# Unire il dataframe riorganizzato con il dataframe principale
df_final1 <- merge(df_final1, colon_wide, by = "SUBJID")

# Controllare i dati uniti
print(head(df_final1))
##   SUBJID                 AEPT                                  AESOC AESTDYI
## 1      3 Dermatitis acneiform Skin and subcutaneous tissue disorders      12
## 2      7         Folliculitis            Infections and infestations     245
## 3      9     Exfoliative rash Skin and subcutaneous tissue disorders       3
## 4     13                 Rash Skin and subcutaneous tissue disorders      37
## 5     15             Dry skin Skin and subcutaneous tissue disorders      72
## 6     17            Urticaria Skin and subcutaneous tissue disorders     175
##   AEENDYI AESEVCD AECONT                  TRT                 ATRT PRSURG DTHDY
## 1      29       1      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   301
## 2     252       3      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   339
## 3      14       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   696
## 4      55       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   197
## 5     157       2      N Panitumumab + FOLFOX Panitumumab + FOLFOX      Y   414
## 6     175       2      N         FOLFOX alone         FOLFOX alone      Y  1043
##   DTH PFSDYCR PFSCR LIVERMET             DIAGMONS AGE    SEX B_WEIGHT B_HEIGHT
## 1   1      54     1        Y 1.86885245901639E+16  65   Male       67      165
## 2   1     339     1        Y 2.78688524590163E+16  52 Female       53      160
## 3   1     301     1        Y     7570491803278680  42 Female       69      148
## 4   1      73     1        Y  2.0327868852459E+16  54   Male       62      160
## 5   1     414     1        Y      380327868852459  67 Female     48.8      155
## 6   0     226     0        Y 3.14754098360655E+16  57 Female     63.6      149
##                 RACE                  B_ECOG    HISSUBTY B_METANM DIAGTYPE
## 1 White or Caucasian Symptoms but ambulatory No sub-type        5    Colon
## 2 Hispanic or Latino Symptoms but ambulatory    Mucinous        3    Colon
## 3 Hispanic or Latino            Fully active No sub-type        4    Colon
## 4 White or Caucasian            Fully active       Other        6   Rectal
## 5 White or Caucasian Symptoms but ambulatory No sub-type        2   Rectal
## 6 White or Caucasian            Fully active No sub-type        2    Colon
##                BMMTNM1    BMMTR1           BMMTNM2    BMMTR2
## 1 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 2 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
## 3 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 4 KRAS exon 2 (c12/13) Wild-type KRAS exon 3 (c61) Wild-type
## 5 KRAS exon 2 (c12/13)   Failure KRAS exon 3 (c61)      <NA>
## 6 KRAS exon 2 (c12/13)    Mutant KRAS exon 3 (c61)      <NA>
##                  BMMTNM3 BMMTR3              BMMTNM4    BMMTR4
## 1 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 2 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 3 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 4 KRAS exon 4 (c117/146) Mutant NRAS exon 2 (c12/13) Wild-type
## 5 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
## 6 KRAS exon 4 (c117/146)   <NA> NRAS exon 2 (c12/13)      <NA>
##             BMMTNM5    BMMTR5                BMMTNM6    BMMTR6
## 1 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 2 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 3 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 4 NRAS exon 3 (c61) Wild-type NRAS exon 4 (c117/146) Wild-type
## 5 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
## 6 NRAS exon 3 (c61)      <NA> NRAS exon 4 (c117/146)      <NA>
##               BMMTNM7    BMMTR7              BMMTNM15   BMMTR15
## 1 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 2 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 3 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 4 BRAF exon 15 (c600) Wild-type KRAS exon 3 (c59/c61) Wild-type
## 5 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
## 6 BRAF exon 15 (c600)      <NA> KRAS exon 3 (c59/c61)      <NA>
##                BMMTNM16   BMMTR16  BMMTR   AE agecat61      BMI
## 1 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old 24.60973
## 2 NRAS exon 3 (c59/c61)      <NA> Mutant High    young 20.70312
## 3 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low    young 31.50110
## 4 NRAS exon 3 (c59/c61) Wild-type Mutant  Low    young 24.21875
## 5 NRAS exon 3 (c59/c61)      <NA>   <NA>  Low      old 20.31217
## 6 NRAS exon 3 (c59/c61)      <NA> Mutant  Low      old 28.64736
##       BMI_category               race   ECOG Primary_Tumor Number_of_sites
## 1    normal weight White or Caucasian 0 or 1         Colon              >3
## 2    normal weight              Other 0 or 1         Colon              >3
## 3 overweight/obese              Other 0 or 1         Colon              >3
## 4    normal weight White or Caucasian 0 or 1        Rectal              >3
## 5    normal weight White or Caucasian 0 or 1        Rectal               2
## 6    normal weight White or Caucasian 0 or 1         Colon               2
##   Location_of_site LBSTRESN Baseline_LDH1 Baseline_LDH2 Age1 Age2 Albumin
## 1            Liver      271  >= 1.5 x ULN     < 2 x ULN >=65  <75    35.5
## 2            Liver      309  >= 1.5 x ULN     < 2 x ULN  <65  <75    34.0
## 3            Liver      214   < 1.5 x ULN     < 2 x ULN  <65  <75    51.0
## 4            Liver      308  >= 1.5 x ULN     < 2 x ULN  <65  <75    41.9
## 5            Liver      280  >= 1.5 x ULN     < 2 x ULN >=65  <75    44.0
## 6            Liver       93  >= 1.5 x ULN    >= 2 x ULN  <65  <75    38.0
##   Alkaline Phosphatase Creatinine Lactate Dehydrogenase Hemoglobin Platelets
## 1                  314 9.9892e+15                   271        103       279
## 2                  253 7.0720e+01                   309        100       527
## 3                  110 6.1880e+01                   214        144       294
## 4                  308 4.4200e+01                   308        124       625
## 5                   61 7.0720e+01                   280        105       296
## 6                  105 5.7000e+01                    85        128       284
##   White Blood Cells Carcinoembryonic Antigen
## 1         5.600e+16                7.200e+00
## 2         7.100e+16                1.210e+02
## 3         8.600e+00                4.166e+16
## 4         1.162e+16                5.000e-01
## 5         9.400e+00                8.728e+01
## 6         8.100e+00                2.900e+01
df_final1$AE <- ifelse(df_final1$AE == "Low", 0, 1)

# Fit the logistic regression model
logistic_model <- glm(AE ~ `Albumin` + `Alkaline Phosphatase` + `Creatinine` + 
                      `Lactate Dehydrogenase` + `Hemoglobin` + `Platelets` + 
                      `White Blood Cells` + `Carcinoembryonic Antigen` + TRT,
                      family = binomial, data = df_final1)

# Summarize the model
summary(logistic_model)
## 
## Call:
## glm(formula = AE ~ Albumin + `Alkaline Phosphatase` + Creatinine + 
##     `Lactate Dehydrogenase` + Hemoglobin + Platelets + `White Blood Cells` + 
##     `Carcinoembryonic Antigen` + TRT, family = binomial, data = df_final1)
## 
## Coefficients:
##                              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                -2.640e+00  3.488e-01  -7.568 3.79e-14 ***
## Albumin                     5.356e-18  1.358e-17   0.394   0.6933    
## `Alkaline Phosphatase`     -6.481e-17  7.937e-17  -0.817   0.4142    
## Creatinine                  2.751e-17  1.585e-17   1.736   0.0826 .  
## `Lactate Dehydrogenase`    -2.832e-17  6.044e-17  -0.469   0.6394    
## Hemoglobin                  2.154e-17  8.691e-17   0.248   0.8043    
## Platelets                  -6.721e-16  3.213e-14  -0.021   0.9833    
## `White Blood Cells`        -2.533e-18  5.609e-18  -0.452   0.6516    
## `Carcinoembryonic Antigen` -2.858e-18  1.058e-17  -0.270   0.7871    
## TRTPanitumumab + FOLFOX     2.050e+00  3.625e-01   5.656 1.55e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 635.38  on 532  degrees of freedom
## Residual deviance: 577.61  on 523  degrees of freedom
## AIC: 597.61
## 
## Number of Fisher Scoring iterations: 14